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ABSTRACT 

A  sequential  search  procedure  for  maximization  of  a  single  variable  multimodal 
objective  function  is  designed  and  investigated  in  this  research.    Existing  sequential 
procedures  require  the  function  to  be  unimodal.    Nonsequential  methods,  though  not 
restricted  in  this  sense,  require  a  large  number  of  samples.    Results  show  that  the  pro- 
posed sequential  method  is  in  this  case  preferable. 
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I.     INTRODUCTION 

A.     PURPOSE 

The  purpose  of  this  research  is  to  design  a  sequential  search  procedure  capable 
of  estimating  and  locating  the  global  maximum  of  a  single  variable,  multimodal 
objective  function  with  a  predetermined  accuracy.    The  procedure  will  be  derived 
in  the  form  of  an  algorithm  such  that  it  can  be  easily  implemented  by  a  digital 
computer. 
B.     LITERATURE  SURVEY 

A  maximization  problem  for  a  digital  computer  provides  an  algorithm  by  which 
the  objective  function  is  evaluated  or  sampled  for  various  values  of  its  argument 
(sampling  points).    Such  a  procedure  must  prescribe  a  rule  to  choose  the  sampling 
points  (to  be  called  the  sample  rule)  and  a  function  (to  be  called  the  estimate) 
depending,  in  general,  on  all  samples  obtained  and  approximating  the  unknown  value 
of  the  maximum,  Shubert  and  Spang  [Refs.   10  and  11].    The  sampling  rule  divides 
these  procedures  into  two  general  categories:    sequential  and  nonsequential.    In  a 
sequential  procedure  the  sampling  rule  utilizes  the  values  of  previous  samples  to 
determine  the  location  of  the  next  sampling  point.    The  procedure  is  called  non- 
sequential if  t  he  sampling  points  are  chosen,  possibly  by  some  random  mechanism, 
in  advance  of  any  computation  or  experimentation. 

Hill,  Spang  and  Wilde  [Refs.  4,   11,  and  13]  have  further  subdivided  these  pro- 
cedures into  three  basic  groups:    (1)  gradient  methods;  (2)  sequential  min-max  methods; 
and  (3)  random  and  grid  search  methods. 

1.    Gradient  Methods 

The  largest  body  of  material  in  the  literature  is  concerned  with  what  has 
been  referred  to  as  "gradient  methods"  of  maximization.    These  methods  utilize  the 
"hill-climbing"  principle  to  determine  the  direction  in  which  the  objective  function 
increases,  i.e.,  measurements  of  the  slope  of  the  function  are  used  as  an  indication 
of  the  direction  toward  the  maximum.    A  general  approach  to  gradient  methods  is 
found  in  Spang  [Ref.  11],    The  first  sampling  point  is  selected  arbitrarily  and  depends 
primarily  on  the  experimenter's  subjective  opinion  as  to  the  location  of  the  maximum. 
If  the  objective  function  is  such  that  the  approximate  location  of  the  maximum  cannot 
be  determined  in  a  subjective  manner,  the  midpoint  of  the  experimental  region  may  be 
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used  as  the  "initial  guess",  Brooks  [Ref.  2],  where  the  experimental  region  is 
defined  as  the  interval  containing  all  possible  sampling  points  for  which  the  ob- 
jective function  is  defined.    Computing  time  will  be  significantly  reduced  the 
closer  the  initial  sampling  point  is  to  the  true  abscissa  of  the  maximum.    The 
remaining  sampling  points  are  determined  by  the  iterative  equation: 

x  =x     +h     D  (1) 

n  +  1         n        n     n 

where  x      is  the  value  of  the  sampling  point  at  the  n  th     iteration,  h     is  a  positive 
n  n 

constant,  and  D     is  the  direction  vector  evaluated  at  the  n  th     iteration.    For  a 
n 

single  variable  objective  function,   D     is,  of  course,  a  scalar.     Thus,  if  D    is  posi- 
»  i  n  n 

tive  the  (n  +  1)  st  sampling  point  will  be  to  the  right  of  the  n  th    sampling  point, 

the  reverse  will  hold  if  D     is  negative.    The  magnitude  of  h     D    determines  how 

n  n      n 

large  a  step  is  taken  in  the  direction  specified  by  D    .    The  iterations  continue 
until 

h    D    *   e  (2) 

n     n 

where  e  >    0    is  the  desired  accuracy  in  estimating  the  true  location  of  the  maximum. 

The  inequality  given  by  (2)  is  considered  the  stopping  rule  for  this  method.     If  the 

inequality  is  satisfied  then  x     is  the  estimate  for  the  abscissa  of  the  maximum  and 

f(x    )  becomes  the  estimate  for  the  maximum.     If  the  inequality  is  not  satisfied,  then 

the  procedure  goes  to  (1).    Although  the  various  gradient  methods  differ  in  their 

choice  of  scale  factor  h     and  direction  vector  D   ,  the  general  approach  to  the 

n  n 

problem  is  the  same. 

For  example,  Spang  [Ref.   11]  uses  as  a  choice  of  the  direction  vector  D 


df 


(3) 
n 


dx 

where  (3)  is  the  value  of  the  derivative  at  the  nth  sampling  point.    The  sample  values 
of  the  (n  +  1)  st  and  nth  sampling  points  are  used  to  choose  the  value  of    h         .  in 
the  following  manner: 

f  (x  . )   >  f  (x    )    then   h         .    =    h 

v    n  +  1  n;  n  +  1  n 

h 
f  (x         .)    <.  f  (x   )    then  h         .    =       n 
v    n  +  1  v    n  n  +  1  ~T~ 
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2.      Sequential  Min-Max  Methods 

Like  gradient  methods,  sequential  min-max  methods  operate  under  the 
assumption  that  the  objective  function  is  unimodal  in  the  experimental  region.     In 
general,  these  procedures  reduce  the  range  of  the  independent  variable  by  a  pre- 
determined amount.     Hence,   it  is  possible  to  determine,  before  taking  any  samples, 
the  number  of  iterations  required  to  reduce  the  range  of  the  independent  variable  a 
given  amount.    The  method  developed  by  Kiefer  [Ref.  7],  based  on  the  Fibonacci 
numbers,   is  the  most  popular  of  all  min-max  methods.     No  other  method  can  guaran- 
tee a  shorter  interval  of  uncertainty  for  all  functions  of  the  class  considered    (uni- 
modal functions  defined  in  the  experimental  region  [a,b].)    Another  related  method 
can  be  found  in  Berman  [Ref.   1],    Spang  and  Wilde  [Refs.   11  and  13]  outline  the 
general  approach  discovered  by  Kiefer. 

This  approach  assumes  that  the  maximum  of  the  objective  function  initially  lies 

within  an  interval  a    ,  b     as  illustrated  by  Fig.   1 .     If  two  sampling  points  are  selec- 
n       n 

ted  within  this  interval,  say  x        and  x     ,  where  n  is  the  iteration  number  and  such 
that  x       <  x  _  ,   it  is  obvious  that  if 

f  (x    ) *  >    f  (x    )  then  the  maximum  I  ies  between  a    ,  * 

n  n  n 

f  IX    )    <  f  x      )   then  the  maximum  lies  between  x  b 

\  2.  z        n 

f  (x    )    =  f  (x  9)  then  the  maximum  I  ies  between  x     ,  x  _ . 

Whenever    the  equality  condition  occurs,  either  the  interval  [a     ,  x       ]  or  [  x_,  b, 

is  selected  to  maintain  mathematical  symmetry.    Thus,  by  this  simple  test  the  size 

of  the  interval  guaranteed  to  contain  the  maximum  can  be  reduced. 

The  sampling  rule  is  based  on  the  total  number  of  tests,  N,  that  must  be 

performed.    N  can  be  determined  once  the  experimenter  has  specified  the  desired 

accuracy  in  estimating  the  true  location  of  the  maximum.    The  length  of  the  final 

interval,  b   -  a     =  6    specifies  the  desired  accuracy. 
N      N       N 
Kiefer  [Ref.  7]  has  shown  that  after  N  iterations 

N       2UN     u       u 
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Figure  1.     Illustration  for  sequential  min-max  method 
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where  U     is  the  value  of  the  N  th  Fibonacci  number  and  (  b_  -  a_  )  is  the  lenqth 
n  0        0  '  * 

of  the  original  interval,  see  also  Spang  [Ref.    11],    Solving  (4)  for  U     ,  the  value 
of  the  N  th  Fibonacci  number  is  determined.    Using  the  table  of  Fibonacci  numbers, 
the  corresponding  N,  which  is  the  total  number  of  tests  required  to  attain  the  de- 
sired accuracy  L    ,  can  be  found. 

With  N  established  the  algorithm  proceeds  in  the  following  manner: 

1.        n  UN-l-n       (b    -a  )  +a 

v     =  n        n  n 

1  U 

'  UN  +  1  -n 

xn     =       N-n      (b     _Q     )    +   Q 

2  ij  n         n  n 

N  +  1  -n 


n   v  r  /    n 


n 


2.  f(x1)>     f(x2)setOn  +  )    =    an,bn  +   ]    =    x2 

f(xn))<     f(xn2)      set   an+  ,   =   x^,bn  +  1    =   bn 

r  /    n   N  r  ,    n   .  .  n 

f  ( x     )    =     f  ( x     )     set    a  =    a   ,    b     u     =   x 

I  Z  n  +   I  n         n  +  1  I 

n       i  i 

or    set    a         ,    -  x,   .   b         ,    =    b    . 
n  +  1  1  n  +  1  n 

3.  n  =  N,  set  6.  .    =    b     -  a 

N  n        n 

n  ^  N,  go  to  1 . 

Either  f  (aM)or  f  (  b..  )    could  be  used  as  an  estimate  of  the  maximum. 

For  large  values  of  n  the  ratios  of  the  Fibonacci  numbers  given  in  step  1  of 
the  algorithm  approach  a  constant: 

U         1 
n  "  '    ~  0.382 


Un  +  1 


and 


U 
2_^    0.618 

Un  +  1 


Therefore  the  following  approximation  formulas  can  be  used  to  obtain  x  .    and  x«, 
f or  n  =    1 ,    .   .   .  ,  N : 
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x"    =   0.382  (b     -a     )  +a 
I  n         n  n 

x"     =   0.618  (b     -a     )  +  a  , 
I  n        n  n 

see  Spang  [Ref.   1 1], 

3.  Disadvantages  of  Gradient  and  Sequential  Min-Max  Methods 

The  major  drawback  of  these  methods  is  that  they  are  successful  only  if 
the  objective  function  is  unimodal,   i.e.,  has  only  one  hump  in  the  experimental 
region.     If  the  objective  function  does  not  satisfy  the  unimodal ity  requirement, 
these  methods  will  be  successful  in  reaching  a  local  maximum  at  the  best.    This 
is  obvious  because  the  methods  are  based  on  the  "hill-climbing"  principle  of 
moving  the  next  sampling  points  in  the  direction  in  which  the  function  increases. 
Since  in  many  practical  problems  it  is  not  possible  to  guarantee  unimodality  of  the 
objective  function,  it  is  important  to  develop  a  search  technique  which  finds  the 
maximum  of  a  multimodal  objective  function.    Furthermore,  gradient  methods 
usually  require  further  regularity  conditions  such  as  the  existence  of  the  first 
and  second  derivatives. 

4.  Random  Search  Methods 

Unlike  gradient  and  sequential  min-max  procedures,  random  search  pro- 
cedures are  not  restricted  to  unimodal  functions.    These  methods  are  nonsequential 
in  that  the  previous  sample  values  do  not  determine  exactly  where  the  next  sampling 
point  will  be  located.    Various  purely  random  methods  can  be  found  in  Brooks, 
Karnopp,  Spang  and  Zachharov  [Refs.  3,  6,    11,  and  14  ], 

The  general  procedure  is  to  select  the  sampling  points  at  random  in  the 
area  where  the  maximum  is  located  according  to  a  fixed  distribution.    After  a 
certain  number  of  iterations  have  been  performed,  the  brgest  value  of  the  objec- 
tive function  is  considered  to  be  the  estimate  of  the  maximum.    Assuming  that  the 
maximum  is  equally  likely  to  occur  anywhere  in  the  experimental  region,   [a,  b  J, 
let  (  b  -  a  )    =    d  be  the  length  of  this  interval .    With  no  prior  knowledge  con- 
cerning the  location  of  the  maximum,   it  is  reasonable  fo  use  a  flat  density  function 
over  the  interval  [  a,  b  ].    A  priori  ,  the  experimenter  specifies  the  accuracy  he 
desires  in  estimating  the  location  of  the  maximum  as  &_,  where  6      is  the  largest 
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interval  of  uncertainly  he  is  willing  to  accept  after  p  iterations.    The  interval 
[  a,  b  ]  is  further  divided  into  N  subintervals  each  of  length  6     .    The  ratio  of  a 
subinterval  to  the  original  interval  is 

g  : 


d 

The  probability  that  a  sampling  point  is  not  in  a  particular  interval  is  (  1  -  g  ) 
and  the  probability  that  it  is  still  not  in  this  interval  after  p  trials  is  (  1  -  g  )     . 
Thus,  the  probability  of  at  least  one  of  the  sampling  points  being  in  this  subinterval 
is 

s  =   1  -  (  1  -  g  )  P    .  (5) 

s  can  also  be  considered  as  the  confidence  level  of  one  of  the  sampling  points  being 
in  a  specified  interval.    Solving  (5)  for  p,  the  required  number  of  sampling  points 
can  be  determined  as 

P  =  loq  (  ]  ~s  )    .  (6) 

log  (  1  -g) 

Brooks  [Ref.  3]  and  Spang  [Ref.   11]  tabulate  the  number  of  iterations  required  for 
various  values  of  g  and  s.     It  can  be  seen  from  these  tables  that  the  number  of  sam- 
pling points  (iterations)  required  increase  quite  rapidly  with  a  reduction  in  g. 

Suppose  x.  is  the  value  of  the  sampling  point  at  the  i  th  iteration,  where 
i  runs  from  1  to  p,  p  determined  by  (6);  R.  is  the  value  of  a  random  number  between 

0  and  1  selected  from  a  uniform  distribution  for  the  i  th  iteration;  and  f.  is  the  esti- 

i 

mate  of  the  global  maximum  at  the  i  th  iteration;  then  the  algorithm  follows: 

1.  Set  i  =  1 

2.  Compute 

x    =  a  +  R  d 

f1   =  f  0c1) 

3.  Set  i  =  i  +  1 

Compute 

x.  =  a  +  R.d 
i  i 

f.    =    max   {  f.  _      ,  f  (  x.  )  } 
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4.      i  -  p,  estimate  the  global  maximum  to  be  f    at  x    . 

P         P 
i  ^  p,  go  to  3. 

5.    Grid  Search  Methods 

Grid  search  procedures  are  systematic  in  the  sense  that  the  sampling 
points  are  equally  spaced  a  predetermined  distance  apart  in  the  experimental 
region.    The  sample  values  for  all  sampling  points  are  obtained  and  that  which  is 
the  largest  is  considered  the  estimate  of  the  maximum. 

Like  random  procedures  and  unlike  gradient  and  sequential  min-max 
procedures,  these  methods  are  successful  in  estimating  the  global  maximum  and 
its  location  for  a  multimodal  objective  function  over  the  experimental  region. 
These  methods  are  also  nonsequential.    A  method  for  grid  search  can  be  found  in 
Spang  [Ref.   1 1], 

In  general,  the  experimental  region  [  a,  b  ]  is  subdivided  into  N  intervals 
of  length  6      ,  where  6       is  the  accuracy  the  experimenter  is  willing  to  accept  in 
estimating  the  location  of  the  global  maximum  for  the  objective  function  in  [a,  b  ]. 
The  number  of  sampling  points,  p,  required  by  this  procedure  can  easily  be  computed 
as 

p=rN-  <7> 

where  d  is  the  length  of  the  original  interval.    This  is  about  half  as  many  iterations 

as  are  required  by  purely  random  methods  to  attain  the  same  accuracy. 

Suppose  x.  is  the  value  of  the  sampling  point  at  the  i  th  iteration,  where 
i  j 

i  runs  from  1  to  p,  p  determined  by  (7);  —. ■      is  the  length  of  the  equidistant   inter- 
vals; and  f.  is  the  estimate  of  the  global  maximum  at  the  i  th  iteration;  then  the 
algorithm  follows: 

1.  Set  i  =    1 

2.  Compute 

x      =   a  +  fL 
6N 

?,    =   f  (x1  ) 
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3.  Set  i  =  i  +  1 
Compute 

X.    =     X.  +   

I  1-1 

6N 

f.    =   max    {    f.  _  ]f  f  (x.  )    } 

A 

4.  i  =  p,  estimate  global  maximum  to  be  f    at  x    . 

P  P 

'  ^  P/  9°  TO  3. 

6.  Disadvantages  of  Random  and  Grid  Search  Methods 

The  major  drawback  to  random  procedures  is  that  the  maximum  is  found 
only  with  some  probability  as  long  as  the  number  of  samples  is  finite.    Furthermore, 
being  nonsequential,  random  methods  as  well  as  grid  search  procedures  require  a 
very  large  number  of  samples  to  estimate  the  maximum  and  its  location  with  rea- 
sonable confidence  level  and  residual  uncertainty.    Although  grid  search  procedures 
require  about  half  the  number  of  sample  points  required  by  random  methods  to 
attain  the  same  accuracy,  the  number  of  samples  required  is  still  too  large  to  be 
practical  in  many  situations. 

7.  Methods  Combining  Gradient  and  Nonsequential  Search  Procedures 

Several  attempts  have  been  made  to  combine  the  "hill-climbing" 
principle  with  nonsequential  search  to  maximize  a  multimodal  function  over  the 
experimental  region.    For  various  methods  utilizing  these  procedures  see  Hill, 
Hartley,  Maryas,   Pijavskii,  Vaysbord  and  Yudin  [Ref.  4,  5,  8,  9,  and  12]. 

Basically  two  approaches  are  used  in  this  type  of  search:    (1)    finite 
random  or  deterministic  global  search  procedures  are  used  to  locate  favorable 
starting  points  in  the  experimental  region  with  gradient  methods  applied  in  the 
intervals  specified  by  the  starting  points;  and  (2)    gradient  methods  are  applied 
in  the  current  interval  being  searched  and  at  some  random  time  during  the  search 
of  this  interval  the  search  goes  to  another  randomly  selected  interval  in  the  exper- 
imental region;  Matyas,  and  Vaysbord  and  Yudin  [Refs.  8  and  12]  illustrate  this 
approach. 

In  approach  (1)  the  experimental  region  [  a,  b  ]  is  divided  into  N  sub- 
intervals  by  some  finite  random  or  deterministic  method.    Each  of  the  intervals 
specified  by  this  procedure  are  then  searched  by  gradient  methods.    The  largest 
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sample  value  obtained  is  the  estimate  of  the  global  maximum.    The  drawback  of 
this  method  is  that  it  does  not  guarantee  that  the  global  maximum  will  eventually 
be  found. 

Approach  (2)  uses  gradient  methods  about  the  initial  sampling  point  until 
a  random  trial  moves  the  search  to  another  interval.    After  each  sample  is  observed 
this  random  trial  is  performed.    If  the  procedure  moves  to  another  interval,  still 
another  random  trial  is  used  to  determine  the  exact  interval  to  be  searched.    The 
method  continues  until  the  stopping  rule  is  satisfied  at  which  time  the  estimate  for 
the  global  maximum  is  considered  to  be  the  highest  sample  value  attained.    The 
drawback  of  this  method  is  that  the  disadvantages  of  purely  random  methods  pre- 
vail, namely,  a  very  large  number  of  iterations  (samples)  are  required. 
8.      Conclusions 

The  methods  discussed  above  are  either  too  stringent  on  regularity  condi- 
tions and  the  requirement  that  the  function  be  unimodal  in  the  experimental  region 
or  impractical  from  the  standpoint  of  the  number  of  iterations  required.    Further- 
more, none  of  these  methods  provide  a  truly  sequential  approach  to  the  problem 
of  multimodality . 

C.     APPROACH  OF  THE  METHOD  TO  BE  CONSIDERED 

In  this  research  the  approach  will  be  to  solve  the  maximization  problem  of 
multimodality  in  terms  of  a  sequential  sampling  rule  which  is  easily  implemented. 
The  formulation  will  restrict  the  class  of  admissable  functions  to  be  maximized  to 
those  of  a  single  variable  and  which  are  globally  Lipschitzian.     !n  addition  it  will 
be  assumed  that  there  are  never  any  errors  present  in  the  observed  values  of  the 
function.    The  assumption  that  the  function  be  globally  Lipschitzian  means  that 
there  is  some  constant  C,  the  value  of  which  is  known,  such  that 

C£     |    f(x)-    f(x')    I  (8) 

I    x    -    x 

for  any  x,  x  ,  x  d  x    ,   in  the  experimental  region  [a,  b] .     If  the  function  is  dif- 
ferentiable,  the  value  of  C  is  usually  not  too  difficult  to  compute.     It  amounts  to 
finding  some  upperbound  on  the  function's  derivative.     In  the  case  where  the 
function  is  given  empirically,  the  constant  C  can  often  be  obtained  from  the 
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physical  nature  of  the  function.     If  the  exact  form  of  the  function  is  unknown,  the 
selection  of  C  will  have  to  be  based  on  the  experimenter's  subjective  judgement. 
In  any  case,  if  it  is  desired  to  estimate  the  value  or  location  of  the  maximum  with 
a  predetermined  accuracy,  knowledge  of  C  or  some  equivalent  information  is 
necessary  to  determine  a  stopping  rule  regardless  of  the  method  used.    This  is 
obvious  since  if  nothing  of  this  sort  is  known  or  assumed  about  the  function,  no 
conclusion  can  be  drawn  about  the  estimation  error  from  a  finite  number  of  samples. 

The  sampling  rule  for  the  maximization  method  under  consideration  and  the 
convergence  of  the  method  are  discussed  in  Chapter  II.    Chapter  III  describes  the 
formulation  of  the  computerized  algorithm  based  on  the  procedures  specified  by 
the  sampling  rule  and  the  convergence  criteria.     It  will  be  described  in  Chapter  IV 
how  the  algorithm,  slightly  modified,  can  be  used  to  estimate  the  zeros  of  a  func- 
tion.   Several  sample  problems,  ranging  in  degree  of  difficulty,  were  selected 
and  solved  by  the  computerized  algorithm  resulting  from  this  research  in  an  attempt 
to  test  the  method  and  as  a  basis  for  comparison  with  the  solutions  obtained  by 
other  methods.    The  sample  problems  and  their  solutions  are  discussed  in   Chapter  V, 
An  experiment  was  performed  to  test  the  algorithm's  sensitivity  to  the  Lipschitzian 
constant  and  the  shape  of  the  objective  function.    The  experimental  procedure, 
results  and  conclusions  are  discussed  in  Chapter  VI.    The  results  and  conclusions 
of  this  research  are  discussed  in  Chapters  VII  and  VIII  and  the  recommendations 
for  future  research  are  discussed  in  Chapter  IX.    Appendices  A  -  F  contain  the 
flow  charts  and  program  listings  for  the  method  under  consideration. 
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II.     DEVELOPMENT  OF  THE  SAMPLE  RULE 
FOR  THE  MAXIMIZATION  METHOD  TO 
BE  CONSIDERED 


Consider  the  sequence  of  sampling  points  {  xn,  x    ,    . . .  ,  x     }  in  [  a,  b  ] 

U        I  n 

and  their  corresponding  sample  values  {  f  (  x_  ) ,  f  (  x     )  . . .   ,  f  (  x     )  }  .    By  (8) 
f  (  X  )    <    f  (  xQ  )    +        C    |    X  -  XQ     J 

f  (x)    <    f  (X]  )     +       C    I    x-X]      | 


Define 


f ( x  )  *  f  ( xn  )    +      C    I  x  -  xn    I    . 


Fn(x)  =  r=0/1 ntf(v  +  c  i*-\  15'      w 


to  be  the  piecewise  linear  function  passing  through  the  points  (  x_,  f  (  x     )    ), 

(  X  .  3  f  (  x.  )  ),    . . .   ,   (  x    ,  f  (  x     )    )  with  the  slope  determined  by  the 
II  n  n 

Lipschitzian  constant  C,  defined  by  (8).     Figure  2  illustrates  the  function  F 
defined  by  (9). 

From  Fig.  2  it  can  be  seen  that  for  n  samples  the  whole  function  f  is  upper- 
bounded  by  F     with  at  most  (  n  +    2  )  peaks  (including  possible  peaks  at  the  end- 
points  of  the  interval  [  a,  b  ]   .   ) 

Define 


Z     =    max  F     (  x  )  . 

n  n 

x  e   [  a,  b  ] 

Clearly.  Z    will  be  the  ordinate  of  the  highest  peak  of  F    .     Let 
'        n  n 


(10) 


cp    =     max  f  (  x  ) 

x    e     [  a,  b  ] 

be  the  global  maximum  of  the  function  f  and  let 

con    =     max  {  f  (xQ  ),...,  f  (  x     )  }  (11) 

be  the  estimate  of  cp  after  n  iterations  (samples)  have  been  observed. 
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f(x3) 


a        x 


0 


b  =x. 


Figure  2.     Graph  of  function 


F     (x  )  =min  {    f  (.x  .    )    +    C        x  -x.     |   }    . 

k  -  V,    i ,    . . .,  n 
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Since  F     upperbounds  f  and  cp      is  the  largest  sampie  value  observed  so  far,  the 
following  is  concluded: 

cp      ^  cp    <   Z    .  (12) 

T  n  n 

It  seems  plausible  at  this  point  to  utilize  the  difference  between  Z    and    cp     as 

n  n 

a  basis  for  selecting  the  sample  rule  under  consideration.    Define 

e    =   Z     -   J  (13) 

n  n  n 

as  the  maximum  possible  error  between  cp  and  cp      after  n  observations.    Furthermore, 

let  g;     be  the  abscissa  of  Z    .    Since  by  (12)  the  global  maximum  m  is  between 

Z      and     cp    ,  it  follows  that  as  e     defined  by  (13)  decreases,  the  global  maximum  cp 
n  n  n  Y 

can  be  more  accurately  determined. 

The  method  under  consideration  seeks  that  choice  of  x         ,  that  will  minimize 

n  +  1 

e    .    E      is  the  optimal  selection  in  the  minimax  sense,   in  that  any  other  choice  of 
n      ^n  r 

x         ,    could  fail  to  decrease  e     by  the  same  or  larger  amount. 
n  +  1  n 

Since  the  above  procedure  is  both  sequential  and  optimal  in  a  minimax  sense, 
the  sampling  rule  of  selecting  the  abscissa  of  the  highest  peak  in  F     is  used  for  the 
method  under  consideration.    That  the  sampling  rule  for  the  method  under  consideration 
is  in  fact  optimal  relative  to  the  class  of  all  functions  that  are  Lipschitzian  is  proved 
by  Shubert  [Ref.   10]. 

The  sampling  sequence  for  this  method  is  defined  mathematically  as  follows: 

xQ    g    [  a,  b  ] 


where  x_  is  the  initial  sampling  point, 


x         ,      such  that 
n  +  1 

F     (x  )  =    Z    ,  n  =0,   1,   ...  , 

n        n  +  I  n 


otherwise  arbitrary,  where 


Z     =    max  F     (  x  )  , 

x  €    I  a,  b  ] 

F     (x)=   min  {    f  (  x     )    +    C  |    x  -  x       |    }    . 

k  =  0,   1,   . . .,  n 
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The  nature  of  the  sampling  sequence  for  this  method  suggests  that  as  n-*co, 

£      t  cp    /    Z.      ^    „  ,  and  e      -»  0.    Shubert  [Ref.   10]  theoretically  proves  that 

these  conditions  are  satisfied  for  the  sampling  rule  just  considered. 

The  rate  at  which  the  error,  e    ,  defined  by  (13)  approaches  zero  is  worthy  of 

n 

consideration  at  this  point.    Shubert  [Ref.   10]  has  shown  that  the  slowest  possible 

rate  at  which  e    approaches  zero  occurs  when  f  =    constant,  for  any  arbitrary 

selection  of  the  initial  sampling  point  x    .     It  remains  to  be  seen  how  fast  e 

0  n 

approaches  zero  when  the  initial  sampling  point  is 

(a+b) 

x0  2 

The  speed  of  convergence  in  light  of  the  conditions  specified  above  will  now  be 
studied. 
Suppose 

Y  =    {   x    e     [a,  b]  :    f  (x  )    =  rp    }  (14) 

is  the  set  of  all  x  for  which  the  global  maximum  is  attained.    Furthermore,  since  the 
case  is  being  considered  for  f  =   any  constant,  let  the  constant  be  zero  for  ease  of 
illustration.    Define  C    >    0    as  the  value  of  the  Lipschitzian  constant.    By  defini- 
tion f  (  x  )    =    0    for  every    x    e     [  a,  b  ],  hence, cp      =    0    for  all  n  and 


A 

Y  =    [a,  b  ]. 


A  further  implication  about  e     defined  by  (13)  can  be  made  since 

-        -7  A 

e      -     Z      -  cp 
n  n  n 

and  cp      =    0 

n 

for  all  n  then  e     =    Z     for  all  n. 
n  n 

Utilizing  the  function  F     defined  by  (9)  and  the  sampling  rule  for  the  method 

n 

under  consideration,  a  general  pattern  for  the  rate  at  which  the  error  e     decreases 
can  be  determined.    Figure  3  illustrates  the  sampling  sequence  when  f  =    constant. 

It  can  be  seen  from  Fig.  3  that  each  sampling  point  after  n  =  2  increases  the 
number  of  peaks  in  F         .by  two  and  that  the  two  new  peaks  created  are  equal  in 
height.    Furthermore,   it  can  be  easily  shown  that  the  height  of  these  two  new  peaks 
is  equal  to  half  the  height  of  the  peak  that  created  them,  see  Fig.  4.    The  calcula- 
tions illustrated  in  Fig.  4  are  as  follows: 
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3  6  0  7  4 

Figure  3.     Illustration  of  sampling    sequence  when  f  =  0, 


b  =x. 
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Figure  4.    Illustration  to  show  that  z.  -    z     -  -rr 


1 

2      Zn 
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Hence, 


c  = 

z  ■ 

n 

"zi 

5„ 

"  5, 

Zl 

■c  = 

5|" 

?n 

z  - 

n 

Z. 

.   Zl 

V 

S| 

<„" 

?i 

z    - 

n 

Zl 

"  zl 

Zl  = 

z 

n 

z 

Z!    = 

Z 
r 

n 

~2~* 

z 

C=-L 


n 


Z     -   Z 

-C  =  -= L 


z  z   -  z 

r  n  r 


z    =   z  -  z 

r  n  r 


Z 

z    =   JL 

r  o 


The  results  of  the  computations  performed  above  are  tabulated  in  Table  I,  for 
n  =  16.    The  purpose  of  Table  I  is  to  outline  the  sampling  sequence  so  that  a  general 
pattern  of  the  way  e     decreases  can  be  determined. 

Let  k  be  the  length  of  the  interval  during  which  the  value  of  Z     does  not  change 

Then  after  two  samples  have  been  taken,  exclusive  of  the  initial  one,  it  can  be  seen 

from  Table  I  that  k  increases  with  powers  of  two  while  Z     decreases  at  the  same  rate. 

n 

The  above  reasoning  can  be  stated  mathematically  in  the  following  manner  : 


If 

0k                       k  +  1 
2     <    n     <    2 

then 

_    C    (b-a) 

2k    +    1 

0k  +  l 

2             >  n 

implies  that 


JTT+1    *n 
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n 

z 
n 

1 

C(b- a)/2 

'.-'1 

2 

C  (b-a)/2 

f 

3 

C(b-a  )/4 

4 

C  (b  -a  )/4 

I 

5 

C (b -a )/8 

k  =  2> 

6 

C (b -a )/8 

7 

C  (b  -a  )/8 

8 

C (b -a )/8 

9 

C  (b  -a  )/  16 

10 

C  (b-a  )/16 

11 

C  (b  -a  )/  16 

k  =  3^ 

12 

C  (b  -a  )/16 

13 

C  (b  -a  )/16 

14 

C  (b  -a  )/  16 

15 

C  (b  -a  )/16 

16 

C  (b  -a  )/  16 

Table  I.    Illustration  of  general  pattern  for  decreasing  e^ 
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Hence 

C(b-a) 
n  n 

and  it  follows  that  if  f  =  constant,  the  error,  e    /  decreases  at  least  as  fast  as 

'  '     n 

C  (b  -a  ) 


Of  course,  the  estimation  error  resulting  from  a  nonsequential  (random  or  grid) 

search  would  also  decrease  as  —  .     However,  as  experimental  results  of  Chapter  V 

n 

indicate,  the  actual  rate  of  the  sequential  algorithm  is  typically  much  faster. 

To  locate  the  global  maximum  in  the  experimental  region  it  is  necessary  to 
determine  the  set  Y  of  all  x  e    [  a,  b  ]  at  which  the  global  maximum  is  attained. 

Let 

Yn    =     {   x  ,    [a,  b]  :    F     (x)    >    %       }     ,  (15) 

n  n  n 

for  n  =   0,   1,    ...,  where  F     is  the  function  defined  by  (9).    Since 


A 

n  yn  + 


<P  «     *     9„  4.   i       ^     <P 


and  f  ( x  )  €.    F      .   .    (  x  )  <    F     (  x  )   , 

n  +  I  n 

for  every  x  e   [  a,  b  ]  it  follows  that 

Y  <=  y  <z.  Y     ,  n  =  0,    1,    ...     .  (16) 

n  +  I  n 

Figure  5.  illustrates  the  heuristic  approach  to  locating  the  global  maximum  cp  . 

Consider  the  situation  after  (  n  +  1  )  samples  have  been  observed.    The  heavy  solid 

lines  in  Fig.  5  represent  the  uncertainty  in  the  location  of  the  maximum  if 

cp         .     ^    cp    .    If   cp         ,    =   cd    then  the  intervals  of  uncertainty  remain  the  same 
n  +  ln  n+ln 

and  only  F     (  x  )  changes.    However,  if   cp         ,    =    f  (  x  .  )  ,  then  the  estimate 

7     n  a  '         ^n  +  1  v      n  +  1 

moves  cioser  to  ^from  below  and  as  a  result  the  intervals  of  uncertainty  will  be 

reduced  in  length  and  more  accurately  determine  the  abscissae  for  locating    cp  . 

Clearly,  Y    is  the  smallest  subset  of  [  a,  b  ]  that  defines  the  location  of  cp  , 

Y  ,  contains  all  elements  of    Y    but  may  contain  several  elements  that  do  not  lead 

n  +  1  ' 

to  the  location  of  cp  .    Finally,  the  largest  uncertainty  set  obtained  from  the  sampling 

sequence  is    Y      .    Thus  , 
n 
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n  +  1 


Figure  5.    Locating  the  global   maximum. 
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Y^Y         ,  <^  Y     ,  n  =  0,  1,  ...,  defined  by  (15).    Furthermore, 
n  +  I  n 

it  has  been  shown  by  Shubert  [Ref.   10]  that  the  Lebesque  measure  of  y     ~  Y  con- 

n 

verges  to  zero  as  n— »  «  . 
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Ml.     DEVELOPMENT  OF  THE  COMPUTERIZED  ALGORITHM 

A.     PRELIMINARY  CONSIDERATIONS 

The  sampling  rule  developed  in  Chapter  II  lends  itself  nicely  to  the  formulation 
of  a  computerized  algorithm  for  estimating  the  global  maximum  and  its  location  of 
any  deterministic   function  of  a  single  real  variable  defined  in  a  closed  interval 
[a,b]. 

The  information  gathered  from  all  samples  obtained  so  far  must  be  stored  in  the 
computer's  memory  so  that  it  can  make  the  proper  decision  as  to  where  the  next 
sampling  point  should  be  located. 

Define  the  data  set  stored  in  the  computer's  memory  after  n  samples  have  been 
observed  as 

Dn    =     ^  '  1 '  Z  P  '    ^  f2  '  Z2  ^  '   *  *  *  '    **  fH    '  ZH     ^  '  ^  n    ^ 

n  n 

where  z     <  z      <    ...    <  z      ,  and  the  vectors  (  t. ,  z.  )  ,  i  =  1 ,  2,    . . . ,   H   ,  are  the 
I  H  ii  n 

n 

coordinates  of  the  maxima  of  F     defined  by  (9). 

n 

Let 

1 
_  __    (a  +b  ) 

0         2 

be  the  initial  sampling  point  and  f  (  x     )  the  corresponding  sample  value.    Further- 
more, 

Za  -    f(xQ)   -C    |a-X()    |   , 

Zb    =     f  ( xQ  )     C|b-x0|     . 

Clearly,  z     and  z     are  the  two  maxima  of  FA  (  x  )  defined  by  (9)  at  the  endpoints  of 
a  b  U 

the  experimental  region  [  a,  b  ].    z      =     z,    since  (  b  -  x     )    =    -  (  a    -  xn  )  implies 

a  b  0  u 

that  zb    -    f  (  xQ  )    -  C  (  a  -  xQ  )    -    zq/  see  Fig.  6.     5Q    =    f  (        }  i$  defined  Qs 

the  estimate  of  -oafter  the  zero  (th)  iteration.     Let  t     =   a,  and  t.    =    b  be  the  abscis- 

a  b 

sae  of  z    and  z,     respectively. 
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Figure  6.    Zeroth  iteration. 
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B.     FIRST  ITERATION 

Define  F_  (  x  )  as  the  piecewise  linear  function  connecting  the  points  (  a,  z     )t 

(  xn/  f  (  xn  )    )  ,     (  b,  z.)  .    For  this  iteration,  maxima  of  F_  (  x  )  consist  of  the 
U  U  b  U 

vectors  (  a,z     )  ,   (  b,  z,    )  .    Since  z      =    z    ,  the  arrangement  of  the  vectors  in 
a  b  a  b 

D_  is  completely  arbitrary.    Suppose  t        =     t     ,  then 


DQ=    {    (b,zb),(a,za);   $q  j    . 


By  (10) 


Z«       =      Z..  =      Z 

0  HQ  a 


and  by  the  sampling  rule 


The  corresponding  sample  value  for  x.    -    a  is    f  (  x  .  )    =     f(a).    Drop  the  vector 

(  t       ,  z        )    =    (a,  z   )  from  the  data  set  DA  and  add  the  new  vector  (  t  ,  z    ) 
HQ       HQ  a  0  r       r 


where 

z     =      _L    (  z      +     f  (  a  )    ) 
r  2  a 

t=a+—       (z-f(a)) 
r  2  C    V    a         v  ', 

see  Fig.  6.    The  new  set  of  data  D.  is  then  obtained  by  rearranging   the  vectors 
(  b,  z,    )  and  (  t  ,  z    )  in  the  order  of  nondecreasing  second  component.     It  is 

obvious  that 

zr  ^   Za    =     Zb  ' 

Thus, 

D,=    tVZr),(b,zb);    5]} 

where  cp    =    max    [cPA#f(a)    }    .    D,is  again  the  set  of  coordinates  for  all 

1  °  ' 

maxima  of  F     (  x  )  defined  by  the  piecewise  linear  function  connecting  the  points 

(a  ,  f  (a)  )  ,  (tr,  zr),  (xQ,  f  (xQ)    ),  (b,  f  (b)    ),  see  Fig.  7. 

35 


z     -  z 
a        r 


f(a) 


Figure  7.    First  iteration  . 
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C.     SECOND  ITERATION 

By  (  10  )     z     =    z        =    z     and  by  the  sampling  rule  x     =    t        =    b.    The 
I  H.  b  z         n, 

corresponding  sample  value  for  x0  =   b  is  f  (  b  )  .    Drop  the  vector  (t      ,  z      )  = 

2  H1       H1 

(  b,  z,    )  from  the  data  set  D     and  add  the  new  vector  (t.,  z.)  where 
b  I  II 

z,    =    [    zb    +     f  (  b  )  )    /    2 
f|     =   b  "     [zb-f(b)]/2C 

see  Fig.  8.    Arrange  the  data  set  such  that 

);  A 
'2      "2 


02=    {    (f]/z1  ),   (t^,  zHJ;    J  2    } 


where  z..     =    max  f    z.   ,  z     }    ,  t         the  corresponding  abscissa; 
H^  I         r  H  ~ 

z     =  min    {    z  ,  z      }    ,  t     the  corresponding  abscissa;  and 

cp2  =     max    {   cp  1  ,  f  (  tH     )    }    . 

D.     n  (th)  ITERATION 
For  n  >    3    , 

Dn   =    £   (VZ1  ' (fH   'ZH    );   '"n  3    • 

n  n 

By  (10),  z     =    z       and  by  the  sampling  rule  x         .    =   t      .    The  corresponding 

n  n 

sample  value  for  x         .    =    t        is    f  (  t       ),   Drop  the  vector  (  t      ,  z       ) 
n  +  I  n  n 


from 
n  n 


the  data  set  and  add  the  two  new  vectors     (  t .  ,  z     )  ,   (  t  ,  z    )  where 

z,=zr=     [zH      +    f(tH)]/2 

n  n 

t,-tH      -    [zH       -    f(tH)]/2C 

n  n  n 

\    "tH     +    [zH     -f(,H    )]/2C 
n  n  n 


see  Fig.  9.    The  new  set  of  data,  D  ,  is  obtained  by  rearranging  the  vectors 

n  +  1 

(*,/  Zj)  /  .../  (*H  _  !  ,    zH         1  )  /  (  tj,  z  ,  ),   (  t      zr  )  in  the  order  of 


37 


i  z.-z! 


■  f  (  b  ) 


Figure  8.    Second  iteration. 
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H 


Figure  9.       nth  iteration 
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nondecreasing  second  component.    Thus,  D         .    is  the  set  of  coordinates  for  all  the 

n  +  I 

maxima  of  F      ,   .  defined  by  (9)  and  cp  =     max   (   cp    ,  f  (  t, .     )    1   . 

n  +  1  7  y  n  +  1  l   Tn        v    H     '    J 

n 

E.  STOPPING  RULE 

The  iterations  continue  until 

A 

Zu       ~       CD         ^     €     , 

H  n 

n 

where  e  >    0  is  the  desired  accuracy  in  estimating  the  global  maximum  cp ,  and  cp 
is  the  largest  value  for  f  observed  so  far,  see  Fig.   10.    The  experimenter  should  be 
realistic  when  he  specifies  e  .     Naturally,  he  would  like  the  error  to  be  zero,  how- 
ever, this  would  be  feasible  in  a  finite  number  of  iterations  only  if  the  function  f 

coincides  with  the  function  F     in  the  area  of  the  maximum.    Since  this  is  usually 

n 

not  the  case,  he  must  be  willing  to  accept  some  error  between  cc  and  cp  rna*"  will 
allow  the  computer  to  execute  the  algorithm  in  a  reasonable  amount  of  time.  Cost 
will  increase  as  more  accuracy  is  desired. 

F.  REDUCTION  OF  DATA 

The  main  computational  disadvantage  of  the  algorithm  as  described  so  far  is  that 

from  the  second  iteration  on  the  memory  content  increases  by  one  vector  (  t.,  z.  )  at 

ii 

each  iteration,  i  .e.,  H      =    n  for  n>  3.    This  can  be  remedied  to  some  extent  by 

dropping  at  each  iteration  all  vectors  (  t. ,  z.  )   e    D    such  that  z.   <    cp         ,   ,  see 
rr  i       i  n  i  n  +  1 

Fig.  11.    Although  the  function  F     is  no  longer  being  stored  completely,  it  is  easy 

n 

to  see  that  the  sequence  of  sampling  points  generated  by  the  algorithm  remains  the 
same  as  before.     It  is  clear  from  Fig.   11  that  since  F     upperbounds  f  and  cp  ^  cpn  +  \ 

the  abscissae  of  all  maxima  of  F    below  S       ,   ,   will    never  be  sampled  anyway,  due 

n  n  +  I 

to  the  very  nature  of  the  sampling  procedure.      The  shape  of  the  function  in  the  ex- 
perimental region  will  determine  how  fast  the  memory  content  increases.     It  has  been 
shown  in  Chapter  II  that  in  the  most  unfavorable  case,  f  =    constant,   it  will  increase 
by  one  vector  per  iteration. 
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"H 


Figure  10.    Stopping  Rule 


H 
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Drop  all  (  t  ,  z.  )    e      D    such 
i       i  n 

that  z.  <   cp  =£   (  t   ,  z     ) 

i  n  +    I  II 

would  be  dropped  from  D    . 

n 


t. 


t. 


1  3 

Figure  11.    Reduction  of  data. 
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G.     LOCATING  THE  GLOBAL  MAXIMUM 

Define  D       as  the  data  set 

e 

Dn  +  1    =    [   'W °H    +    1,ZH     +    l);^n  +  l    ] 

n  n 

in  the  computer's  memory  after  the  final  iteration  of  the  algorithm  as  described  so 

far.    Furthermore,  define  F  s   as  the  corresponding  function  of  F  defined  by  (9), 

D      then  contains  coordinates  for  all  maxima  of  F    which  upperbounds  the  function 

f(x),i.e.,f(x)<sF      (  x  )  by  (9).    Thus,  the  second  components  z.  of  all 

(  t. /  z.  )  e    D      are      e  ~  close  to  cp  and  are  at  least  as  great  as  cp      =    cp 
iie  e  n  +  1 

after  the  last  iteration.    The  information  contained  in  D       provides  many  alterna- 
tives for  estimating  the  location  (s )  of  cp   in  [  a,  b  ],  depending  on  the  desires  of 
the  experimenter.    Four  of  these  alternatives  will  be  examined  below. 

1.  Alternative  I 

This  alternative  could  be  used  if  the  experimenter  is  interested  in  a  single 
estimate  for  the  location  of  cpin  [  a,  b  ].     Only  a  slight  modification  to  the  estab- 
lished algorithm  is  necessary  to  obtain  these  results.    Simply  carry  along  the  abscis- 
sa of  the  largest  sample  value  obtained  so  far  in  the  data  set  D    .    Thus,  the  revised 

n 

data  set  at  each  iteration  becomes 

D      ={    (V  z.  ),...,  (t      ,  z       -);  (xA    ,  <p     )   }    . 

n  II  n         n  cp  n 

n  n  xn 

A,  /y 

At  the  final  iteration  (  x      ,  m     )  will  be  the  coordinates  of  the  estimated  cpand  its 

e  e 

location  in  [a,  b  ] . 

The  major  drawback  to  this  alternative  is  that  it  fails  to  estimate  the  loca- 
tion of  all  global  maxima  in  [  a,  b  ],  however,  it  is  relatively  simple  and  particu- 
larly useful  if  the  function  is  unimodal  or  when  only  a  single  estimate  for  the  loca- 
tion of  the  maximum  is  desired. 

2.  ALTERNATIVE  II 

This  alternative  may  be  successful  in  estimating  all  locations  of  the  global 

maximum  in  [  a,  b  ].    D      contains  all  information  necessary  to  obtain  these  results. 

e 

From  the  set  of  first  components  of  (  t  ,  z.  )  in  D       ,  obtain  a  new  data 

i       •  e 

set 
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A 


G    -     {(*,,  y,  ),...,(  tu  ,  y-u    );  q>  g   }  f 


e         e 


where  y .    -    f  (  t.  )  ,  i  -  1 ,   . . . ,  H 

ii  e 

Define 

T    =    {    t.    :    (  t.,  y.  )    e    G,  y.  >    cp  e  } 
III  I 

as  the  set  of  points  estimating  the  location  of  oin  [a,  b  ],  see  Fig.   12. 

The  major  drawback  of  this  alternative  is  that  an  uncertainty  set  con- 
taining the  locations  of  global  maxima  in  [  a,  b  ]  cannot  be  precisely  determined, 
x    e     T  implies  that 

cp  -  f  (x  )  <;    e    , 

however,  the  true  locations  may  still  be  outside  of  T.    Figure  12  illustrates  why 
this  method  may  not  be  successful  in  estimating  the  locations  of  all  global  maxima. 
For  this  particular  example,  (  t«  ,  y«  )  /    T  because  y«    <   cp       .    Even  though  t^ 
is  close  to  an  abscissa  for  which    cd   is  attained  it  is  not  considered  in  this  case.    The 
abscissa  t      is  less  close  to  an  abscissa  for  which  cp  is  attained  than  is  t_  and  yet  this 
value  is  considered  an  estimate  by  this  method. 

Despite  the  fact  that  the  method  fails  in  this  respect,   it  may  still  be  used, 
with  some  reservations,  if  more  than  one  location  estimate  is  desired.    The  program 
listing  for  this  alternative  is  located  in  Appendix  E. 
3.     Alternative  III 

This  alternative  results  in  the  genuine  uncertainty  set  of  the  smallest  pos- 
sible size.  From  the  considerations  at  the  end  of  Chapter  II,  it  follows  that  this  set 
is  given  by 

V  =     {   x    e    [a,  b]  :    F      ( x  )    >    %     }    . 

e  e 

From  the  piecewise  linear  character  of  the  function  F    it  follows  further  that  the  set 

e 

V  is  a  union  of  a  finite  number  of  disjoint  closed  intervals 

V  =    U   [x        x     ]. 

i         I.        r. 
i  i 

The  following  computations  are  needed  to  obtain  V  from  the  algorithm:    Rearrange  all 
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a    t 


1  2 

Figure  12.    Alternative 
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(  t.  ,  z.  )   e    D    In  nondecreasing  order  of  first  components.    Then   compute: 
i        i  e 

1     /  A     \ 

xi,  =  'i-c-t'i"  ».) 

unless  x      as  computed  above  is  less  than  the  left  most  endpoint  of  the  interval 
1 

[  a,  b  ]  at  which  time  x      becomes  a.    That  is, 

1 

x.      =     max    {   a,  t     -  1    (  z     -    J    )    }   . 

'i  i     c      '       « 

x       =     min     {   b,  t    +  — -  (  z.   -    J    )   }    > 
r.  •        L         i  e 

x,  =      max    f   a,  t. ( z.    -    cp     )    }    • 

'i  +  1  .     c      »       •« 

The  iterations  continue  until 

x       <     x. 
r.  I.         , 

i  i    +    l 

at  which  time  x     becomes  the  endpoint  of  the  first  interval  and  x.  the  left  — 

r.  I.         i 

i  i    +    l 

most  endpoint  of  the  second  interval.    This  procedure  continues  until  i    =    H    at 

e 

which  time  the  rightmost  endpoint  of  the  last  interval  is  computed  to  be 

x     =   x         .    The  union  of  all  these  intervals  belong  to  V.    Clearly.  Y^V.    See 
r.         r 
i  H 

e 

Fig.   13  for  an  illustration  of  this  approach. 

Although  this  alternative  provides  the  genuine  uncertainly  set  of  loca- 
tions for  cp  ,  the  number  of  intervals  in  the  set  becomes  unwieldy  and  too  clumsy  to 
be  practical.    However,  one  can  be  assured  that  cp   is  contained  in  at  least  one  of 
these  intervals.     The  flow  chart  and  program  listing  for  this  alternative  are  located 
in  Appendices  C  and  F  respectively. 
4.     Alternative  IV 

This  alternative  is  an  attempt  to  reduce  the  number  of  intervals  in  V,  for 

clarity,  at  the  expense  of  enlarging  the  uncertainty  set  containing  cp    .    The  nature 

of  the  sampling  rule  is  such  that  when  the  algorithm  stops,  the  first  components  of 

(  t.,  z.  )    e     D     cluster  about  y  defined  by  (14).    By  rearranging  the  vectors 
i       i  e 


46 


A 


Figure  13.    Sketch  of  Alternative  III. 
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(  t.,  z.  )    g    D     in  nondecreasing  order  of  first  component  ,    the  breakpoints  be- 
i       i  e 

tween  clusters  are  usually  well  defined  as  relatively  large  gaps  between  some  t., 

t.  in  the  rearranged  data  set.    Utilizing  the  endpoints  of  these  clusters  and  the 

function  F      ,  it  is  a  very  simple  task  to  consolidate  the  points  in  each  cluster  to  a 

e 
single    interval  of  uncertainty.    The  set  of  all  such  intervals  formed  in  this  manner 

is  the  set  of  intervals  under  consideration.    Clearly,  V  is  a  subset  of  the  union  of 

these  intervals,  since  the  new  set  will  include  those  portions  of  [  a,  b  ]  between  the 

intervals  in  V  which  do  not  contain  cp  .     Hence,  the  new  uncertainty  set  is  larger 

than  V  but  is  more  practical  for  presenting  experimental  results. 

Let  k  be  the  number  of  clusters  in  the  rearranged  data  set  and  let 

H    =    {[u,     ,u     ],.../.[  u  ,     ,    u      ]} 

'i    ri  'k     \ 

be  the  set  of  intervals  such  that  u.    is  the  leftmost  endpoint  in  the  i  th  cluster  and 

i 

u      is  the  rightmost  endpoint  of  the    i  th  cluster  in  H  ,   i  =    1,   ...,  k  ;  see  Fig.   14. 
i 

Clearly,  [  u.  ,  u     ]    e    H    does  not  define  the  entire  interval  of  uncertainty  under 
i         i 

consideration  for  the  i  th  cluster.    There  is  a  portion  of  [  a,  b  ]  to  the  left  of  u  . 

i 

and  a  portion  of  [  a,  b  ]  to  the  right  of  u     which  must  be  included  in  the  uncer- 


r. 
i 


tainty  set  under  consideration.    This  is  obvious  because 


A 

<P     £   cp    £    F 


which  implies  that  cp    could  very  well  be  contained  in  the  intervals  [  w.  ,  u.   ] 


i         i 


and  [  u    ,  w     ],  see  Fig.   14,  where 
^  1 


wL    =     u,      -c       (zL    -     cpe) 
i  i  > 


C  r. 

1 


w      =u       +     -pr   ( z_    "  «p .  )    • 
r.  r:  C  r.  e 


The  set  of  uncertainty  intervals  defined  by 
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Figure  14.    Sketch  of  uncertainty  set  W. 


A    __ 

cp 
e 


L_ 


i-th   cluster 


(  i  +  1  )  st  cluster 


intervals  forming  the  set  V  (  Alt.  III.  ) 


added  to  V  to  obtain  W  (  Alt.   IV0  ) 


O  points  forming  the  set  T  (  Alt.  II.  ). 
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W    =     {    [  w.     ,  w      ],...,[  w.     ,  w       ]    } 
1  1  1  k         k 

is  the  set  desired  for  this  alternative,.    The  flow   chart  and  program  listing  for  this 
alternative  have  purposely  been  deleted  from  the  Appendices  since  the  intervals 
can  be  rather  easily  obtained  from  results  of  Alternative  III. 
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IV.     THE  COMPUTERIZED  ALGORITHM  AS  A  MEANS  FOR 
LOCATING  ZEROS  OF  A  FUNCTION 

The  algorithm  resulting  from  this  research  can  also  be  used  to  locate  the  zeros  of 
a  function.     Only  a  slight  modification  to  the  algorithm  is  necessary  to  obtain  these 
resu  I  ts . 

By  setting 

g(x)  =  -  |  f  (x)   |  , 

the  transformation  is  such  that  all  global  maxima  cp  of  g  in  [  a,  b  ]  occur  at  the  zeros 
of  f  (  x  )  .    This  fact  allows  the  computerized  algorithm  to  conform  nicely  to  the  task 
at  hand.     If  the  function  f  has  at  least  one  zero,  ihe  global  maximum  of  g  is    cp  =    0, 
therefore  the  new  data  set  after  each  iteration  will  be 

Dn  +  1    "    I    <VZ1> (tH       +    1    'ZH      +    1);°]    • 

n  n 

The  iterations  continue  as  before  until 

Zu  Z  £       e 

H      -     cp    £    e      or         H 
n  n 

since  cp  =     0  ,  at  which  time  the  iterations  stop.    The  zeros  of  f  (  x  )  are  estimated 
by  applying  the  procedures  discussed  in  Alternatives  I  -  IV  of  Chapter  III.    The  alter- 
native used  will  depend  on  the  form  and  the  accuracy  desired  by  the  experimenter. 
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V.     SAMPLE  PROBLEMS 

The  results  of  the  following  sample  problems  were  used  as  criteria  for  (1)  test- 
ing the  algorithm;  and  (2)  comparison  with  results  of  the  same  problems  obtained  by 
other  search  procedures. 

A.     SAMPLE  PROBLEM  I 

Given  the  deterministic  function 

f (x )    =     3    +   x -x 

estimate  the  global  maximum  cp  and  its  location  in  the  experimental  region  [  0,  2  ] 
with  a  desired  accuracy  of  e   =    *01 . 
1 .      Discussion 

This  problem  was  used  throughout  the  design  stages  of  the  computerized 
algorithm  due  to  its  relative  simplicity.     It  was  primarily  used  in  the  design  stages 
because  the  global  maximum  and  its  location  could  easily  be  determined  by  differ- 
ential calculus. 

Clearly,  the  function  is  unimodal  in  [  0,  2  ],  see  Fig.   15,  and  by  the 
calculus, 

cp  =   3.25, 

Y  =    {   .5}    . 

Since  the  function  is  differentiable,  the  Lipschitzian  constant  C  can  be  computed 
as 


x  e   [  a,  b  ] 
2.     Results 


max  {    |   -2x    +    1    |   }  3    . 

x  e    [0,  2] 


For  the  desired  accuracy  of    e   =    .01,  the  algorithm  required  63 

iterations  (samples)  giving  the  estimate  cp      =    3.25.     It  was  determined  a  prior? 

e 

from  Fig.   15  that  the  number  of  global  maxima  was   I,  hence,  Alternative  I  was 
used  to  estimate  the  location  of  cp   .    The  algorithm  gave  x    =    .5    for  this  estimate. 
The  largest  number   of  vectors,  (  t.,  z.  )  ,  stored  in  the  computer's  memory  was  40 
and  the  computer  time  necessary  (including  the  time  required  to  compute  f  (  x     )    ) 
was  4.36  seconds. 
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f(x) 


*■      X 


Figure  15.    Graph  of  f  (  x  )  =    3+x-x 
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B.     SAMPLE  PROBLEM    II 
Given  the  function 

f  (x  )=2Z1     sin  [  -  (  i  +  1  )sTx"   +    i  ], 
i  =  I 

estimate  the  global  maximum  cp  and  its  location  in  [  .01,    10  ]  with  a  desired 
accuracy  of  g  =    .01 . 
1.      Discussion 

The  results  of  Sample  Problem  I  seem  to  indicate  that  the  computerized 
algorithm  under  consideration  is  quite  successful  in  maximizing  a  unimodal  function. 
It  remains  to  be  seen  how  the  algorithm  reacts  to  functions  which  are  multimodal  in 
nature.    The  problem  described  above  was  used  to  test  the  algorithm's  ability  to 
maximize  multimodal  functions. 

The  plotting  package  of  an  IBM  360/57  computer  was  used  to  plot  f  in 
[.01,  10  ],  see  Fig.   16.     It  is  obvious  from  Fig.   16  that  f  is  multimodal  in  [.01,    10  ] 
and  that  the  global  maximum  occurs  in  the  interval  [.01,   1  ].    Although  the  func- 
tion   f  is  differentiable,   it  is  difficult  to  obtain  the  exact  global  maximum  and  its 
location  using  the  techniques  of  differential  calculus.     Hence,  gradient  techniques 
were  applied  in  [  .01,    1.0  ]  to  obtain 


9 


=    12.0313  .... 


at  Y   =    {   0.2416  ...  }    . 

The  fact  that  f  is  differentiable  allows  C  to  be  computed  as 

5  -1/2 

C  =   max 

xe   [  .01,   10] 


y     -  (  i  +  1  )   x 
i  =  1 


350. 


For  sake  of  illustration,  the  sequence  of  sampling  points  x     vs.  the  iteration 

n 

number  n  were  plotted,  see  Fig.   17.     It  shows  how  the  search  soon  abandons 
regions  of  [  a,   b  ]  where  f  (  x  )  is  low  and  concentrates  on  search  in  the  vicinity 
of  maxima.    The  number  of  vectors  stored  vs.  the  number  of  iterations  required 
were  also  plotted    to  illustrate  the  effects  of  the  data  reduction  technique  utilized 
by  the  proposed  algorithm,  see  Fig.   18. 
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Figure  16.    Graph  of 

f(x)  =v~*    ' sin  [  -  ( i  + 1 )  >nr  +  i  ]. 
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2.      Results 

For  the  desired  accuracy  of  e  =    .01/  the  algorithm  required  890  iterations 

giving  the  estimate  ^      =     12.  0312  .    Since  the  number  of  global  maxima  was  deter- 

e 

a  priori  from  Fig.   16  to  be  1,  Alternative  1  was  again  used  to  estimate  the  location 

of  cp   .    x  was  estimated  as  0.2415    .    The  largest  number  of  vectors,   (  t.,  z.  )  , 

stored  in  the  computer's  memory  was  418  and  the  total  computer  time  (including 

the  time  to  compute  f  (  x     ))  was  23.96  seconds. 

n 

C.     SAMPLE  PROBLEM  III 

Given  the  function 
5 
f  (x  )    =2      \      '  sin  [  (  i  +  1)    x    +    i  ], 
i  =  1 

estimate  the  global  maximum  cpand  its  location  in  [  -10,   10  ]  with  a  desired  accura- 
cy of  e   -    .01 . 

1.      Discussion 

So  far  the  method  under  consideration  appears  to  be  successful  in  maxi- 
mizing functions  with  a  single  global  maximum.    The  function  under  consideration 
was  used  to  determine  the  algorithm's  ability  to  maximize  functions  with  more  than 
one  global  maximum. 

A  computer  plot  of  f  was  used  to  determine  the  nature  of  the  function, 
see  Fig.   19.     It  appears  from  Fig.   19  that  f  has  more  than  one  global  maximum  in 
the  experimental  region.    The  function  is  differentiable,  but  too  difficult  to  locate 
the  maximum  using  calculus.    Gradient  techniques  were  again  applied  in  the  inter- 
vals assumed  to  contain  the  maximum.    These  methods  located  the  maximum 

cp  =    12.0313  ... 

at  y  =    {  -6.7745  ...,  5.7918  ...  }    . 

C  was  computed  as 

C=   max  {    |    i   (  i  +  1  )    |    }   =    70. 

xe  [-10,   10] 
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Figure  19.    Graph  of  f  (  x  )  =  Jfzi]      i  sin    [  (  i  +  1  )    x  +  i   ]  . 
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2.     Results 

For  the  desired  accuracy  of  c  =    .01,  the  algorithm  required  444  iterations 
giving  the  estimate  c     =    12.0313.     Examination  of  Fig.   19  revealed  the  possibility 
of  more  than  one  global  maximum  in  [  -10,   10  ],  hence,  Alternative  IV  was  used 
to  locate  cp  in  this  interval.    The  residual  uncertainty  in  the    location  was  reduced 
to  three  intervals 

W  =    {   [  -6.7907,  -6.7595  ],  [  -0.5129,  -0.4261  ], 
[5.7749,  5.8061  ]    } 

(there  is  a  local  maximum  f  (  x  )  =    12.  0312  . . .  at  x  =  -0.4914  . . .  .)    The 
largest  number  of  vectors,  (  t.,  z.  )  ,  stored  in  the  computer's  memory  was  less  than 
250  and  the  total  computer  time  was  13  seconds. 

D.     SAMPLE  PROBLEM  IV 

Given  the  function 

5  __ 

f  (x  )=  Y~\       ■  sin    [  -(i  +  1  )>Tx~+i  ], 
i  =  1 

estimate  the  zeros  of  f  in  [  .01,   10  ]  with  a  desired  accuracy  of  e  =    .01. 
1.      Discussion 

This  problem  was  used  to  test  the  algorithm's  ability  to  locate  zeros  of  a 
function.    The  function 

0(x)  =  -|f (x) I 

was  plotted  by  the  computer  in  an  effort  to  determine  the  intervals  in  [  .01,    10  ] 
where 

max  g  (  x  )  =  0 

xe    [  .01,   10] 

(zero  will  be  the  global  maximum  of  g  as  described  in  Chapter  IV.)    The  location 
(s)  of  the  global  maxima  m=  0  are  the  zeros  of  f.    Figure  20  was  used  to  approxi- 
mate starting  points  for  gradient  methods  in  order  to  locate  the  true  zeros  of  f .    The 
zeros  were  determined  as  (  0.0215  ...,  0.0000),  (  0.6180  ...,  0.0000), 
(2.0795  ...,  0.0000),    (  4.2232  ...,  0.0000)  ,  (  6.3051   ...,  0.0000  ), 
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Figure  20.    Graph  of  g  (  x  )  =  -  |  f — '    i  sin  [  -  (  i  +  1  )       x    +    i    ] 
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(9.0865  . ..,  0.0000),  C  was  computed  as 

1 
C  =   max  5  "  "2 

xe  [  .01,   10]  { 


C  (  '  +  1  )  x  I  }  =  350- 

i  =  1  ' 


2 .     Resu  I  ts 


For  the  desired  accuracy  of  e  =    .01,  the  algorithm  required  2253  itera- 
tions.   Alternative  IV  was  used  to  estimate  the  locations  of  the  zeros.    The  residual 
uncertainty  in  the  location  was  reduced  to  six  intervals 

[0.0214,0.0239]    ,    [0.6173,0.6186], 

[2.0795,2.1173],    [4.2280,4.4323] 

[  6.2246,  7.2566  ]    ,    [  8.8457  ,  9.1452  ]  . 
The  largest  number  of  vectors  (  t.,  z.  )  stored  was  474  and  the  total  computer  time 
was  52.87  seconds. 
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VI.     EXPERIMENT  WITH  THE  COMPUTERIZED  ALGORITHM 
A.     PROCEDURE 

The  following  experiment  was  performed  to  test  the  algorithm's  sensitivity  to 
the  LIpschltzIan  constant  and  the  shape  of  the  criterion  function.  To  obtain  data 
for  the  experiment,  eighteen  functions  of  the  form 

f  =  fmax{   10  [    ]-(±zJY)}      if   o*x    *    10 


max 


[«  [    1- 


(-): 


]   }     if    -10  <  x    <   0 


were  used  with  the  following  ranges  on  the  parameters: 


0s  0   <    10  , 
0<  3  ^     5  , 


0  < 


5. 


Figure  21  illustrates  the  general  form  of  the  function  f  and  the  eighteen 

functions  with  their  exact  parameters  are  illustrated  in  Fig .  22. 

C    was  computed  as 


C  -    max 

x  e    [  -10,   10] 


d  f 


<^,v 


dx 


The  Lipschitzian  constant  C  was  allowed  to  vary  from  Its  minimum  value  defined  by 
(17)  to  five  times  its  minimum  value  for  each  of  the  eighteen  functions.    The  com- 
puterized algorithm  was  used  to  maximize  each  variation  of  f  and  the 
corresponding  results;  namely,  the  number  of  iterations  required  and  the  largest 

number  of  vectors  in  D    stored,  were  recorded  in  Table  II.    All  results  are  based 

n 

on  a  desired  accuracy  of  e    =    .01 . 
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B.     EXPERIMENTAL  RESULTS 

A  brief  description  of  Table  II  follows: 

Cmin     =   max  []0O_  ]0a[  (18) 

xe  [-10,   10] 

defined  by  (17).    i    C     .    ,     i  =  1,  2,  3,  4,  5  were  the  variations  of  C  used  by  the 

mm 

algorithm  to  estimate  the  global  maximum  m  for  the  functions  1  -  18  specified  by 
the  shape  parameters  a  ,   ?  ,  y    .    Figure  22  illustrates  the  eighteen  functions  under 
consideration.    The  data  points  are   the  number  of  iterations  required  and  the  larg- 
est number  of  vectors  in  D     for  the  corresponding  i  C    .      and  function. 

n  mm 

The  columns  of  Table  II  and  the  graphs  of  the  tabulated  results,  Fig.  23  and 
Fig.  24,  seem  to  confirm  the  theoretical  conjecture  that  the  number  of  iterations 
and  the  number  of  vectors  stored  increase  approximately  in  direct  proportion  to 

i   C    .    . 

mm 

The  effects  of  the  shape  of  the  criterion  function  are  not  so  readily  apparent  from 
Table  II  and  Figs.  23  and  24.     In  order  to  obtain  these  results  it  is  necessary  to  de- 
termine the  value  of  the  data  points  for  each  function  with  different  shape  para- 
meters evaluated  for  the  same  constant  C.    From  Table  II  it  is  obvious  that  the 
common  C  would  have  to  be  100  since 

max  C    .     =    100  . 
mm 

Unfortunately  time  was  not  available  to  compute  the  data  points  for  each  function 

using  a  common  C.     In  an  effort  to  use  the  available  data  to  estimate  the  algorithm's 

sensitivity  to  function  shape,   it  was  hypothesized  that  fori    =1 

max    C    . 

mm 


C    . 

mm 


where  I  is  the  number  of  iterations  required  when  C  =  C    .    ,  would  approximately 

mm 

equal  the  number  of  iterations  required  if  C    .     =   max  C    .    .    Similarly,  for  i  =  1 , 
n  n  mm  mm 

it  was  hypothesized  that 

max  C    •  _ 

mm  S 


C    . 

mm 
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where  S  is  the  largest  number  of  vectors  in  D    when  C  =  C        ,  would  approximately 

n  min  rr  r 

equal  the  largest  number  of  vectors  in  D    required  if  C    .     =   max  C        .    This  hypo- 

n  mm  min  /r 

thesis  seems  plausible  since  both  iterations  required  and  amount  of  vector  storage 

required  is  approximately  in  direct  proportion  to 

max    C     . 

mm 


C    . 

mm 

The  results  based  on  this  hypothesis  are  tabulated  in  Table  III.  Even  from  this 
table  it  is  not  readily  apparent  how  the  shape  of  the  function  affects  the  algorithm, 
due  to  the  interaction  of  the  parameters.    However,  the  data  points  obtained  for 
each  function  can  be  compared  to  their  corresponding  graphs  in  Fig.  22  to  anaiyze 
these  effects. 

The  data  points  of  Table  HI  have  been  rearranged  in  nonincreasing  order  and 

tabulated  in  Table  IV.    Table    IV  makes  the  comparisons  between  data   points  and 

corresponding  figures  a  little  more  systematic.    From  these  comparisons  it  becomes 

obvious  that  the  largest  number  of  iterations  required  and  the  largest  number  of 

vectors  required  to  be  stored  occur  when  the  function  is  relatively  flat  in  the  area 

of  the  global  maximum  cp    .    Furthermore,  one  can  see  by  a  similar  comparison  that 

as  other  local  maxima  less  than  or  equal  to  the  global  maximum    cp  get  closer  to    cp 

the  iterations  and  vectors  in  D     increase  accordingly. 

n 

C.     CONCLUSIONS 

It  was  concluded  from  experimental  results  that  the  algorithm  is  most  sensitive 
to  the  value  of  the  Lipschitzian  constant  C.     If  the  value  of  C  selected  by  the  ex- 
perimenter (call  this  value  C  )  is  greater  than  C    .     defined  by  (18)  then  the  search 

s  mm        ,         v 

for  the  global  maximum  cp  will  require  approximately  /      s     \  times  as  many  itera- 

\     rn|n 

tions  and  vector  storage  required  than  would  be  required  if  C    =    C    .    .    Further- 

s  mm 

more,  as  the  number  of  iterations  increase  the  amount  of  computer  time  necessary  to 

make  the  computations  increase  proportionately.    Similarly,  as  the  number  of  vectors 

in  D     increase  the  amount  of  computer  core  memory  necessary  to  store  the  vectors 

increases  proportionately.    Hence,  since  computer  cost  is  a  function  of  time  and 
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core  required,  the  cost  of  the  experiment  will  increase  proportional  to  C    .    Clearly, 

experimental  costs  will  be  optimized  when  C    =    C    .    . 

s  mm 

If  the  value  for  C  cannot  be  obtained  analytically  and  the  nature  of  the  func- 
tion is  unknown,  the  experimenter  must  be  cautious     in  his  selection  of  C  .     If 

s 
C     <    C    .     it  is  possible  that  the  algorithm  will  fail  completely.    On  the  other  hand 

the  algorithm  will  always  work  if  C     >    C    .    ,  but  at  a  higher  cost.    Therefore,   if 

s  mm  ' 

the  exact  value  for  C    .     cannot  be  determined  a  priori,  it  would  be  better  for  the 

mm 

experimenter  to  risk  an  over  estimate  of  C  at  a  greater  cost  than  to  underestimate 

and  accomplish  nothing. 

It  can  also  be  concluded  from  the  experimental  results  that  the  algorithm  is 
sensitive  to  the  function's  shape,  though  not  as  sensitive  as  it  is  to  the  Lipschitzian 
constant.     In  regards  to  shape,  the  algorithm  is  most  sensitive  to  how  flat  the 
function  is   in  the  area  of  the  global  maximum.     Of  secondary  importance,  but  still 
significant  is  the  height  of  all  local  maxima,  in  the  experimental  region,   less  than 
the  value  of  cp    .    As  the  heights  of  these  local  maxima  approach  cp   ,  the  computa- 
tion cost  increases  accordingly.     Furthermore,  though  not  established  by  experimental 
results,  it  seems  plausible  that  as  the  number  of  local  maxima  relatively  close  to  ^ 
increase  the  cost  will  increase  accordingly. 

Knowledge  of  the  shape  of  the  function  to  be  maximized  may  be  useful  for 
making  gross  cost  estimates  or  as  a  programming  aide. 
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VII.     RESULTS 

A.     RESULTS  OF  SAMPLE  PROBLEMS 

Consider  Sample  Probiem  III  of  Chapter  V.    The  grid  search  would  require 

6N 

sampling  points  for  the  same  accuracy  of  e  =  .01,  where  d  =  |   (  -10  )  -   (  10  )    |  =  20 
and 

6N     =        2  e  -02    =    .000  285  ... 

—  "      70 

Thus,  the  number  of  sampling  points  required  by  the  grid  search  is  approximately 
p    =    7   x    104 

while  the  number  of  sampling  points  required  by  the  sequential  method  proposed  by 
this  research  was  less  than  250.    Similar  comparisons  can  be  made  for  the  other 
sample  problems  and  the  results  would  be  relatively  the  same,  namely,  the  number 
of  sampling  points  required  would  be  considerably  higher  for  grid  search  than  for 
the  sequential  method  proposed  by  this  research. 

Purely  random  methods  would  require  about  twice  as  many  sampling  points  as 
the  grid  search  for  the  same  accuracy  and  for  a  sufficient  confidence  level . 

This  indicates  that  for  nontrivial  functions,  the  convergence  rate  of  the  algorithm 

tends  to  be  much  faster  than  the  slowest  theoretical  rate    C    (  b  -  a  ) 

n 

B.     CONSEQUENCES  OF  MEASUREMENTS  ERRORS  OR  SMALLER  C  THAN 
REQUIRED 

Figure  25  illustrates  the  consequence  of  a  measurement  error,  in  this  case, 
when  the  measured  value  of  the  function  is  less  than  the  actual  value  in  the  region 
containing  the  maximum.    Figures  26  and  27  illustrate  the  consequences  of  a  smal- 
ler C  than  required  by  the  algorithm. 

In  both  cases  F     (  x  )  may  fail  to  be  an  upperbound  to  f  (  x  )  in  some  regions  of 
n 

the  domain  [  a,  b  ].    Hence,  some  regions  may  be  excluded  from  future  search.     If 
these   regions  happen  to  contain  points  where  the  global  maximum  is  attained,  the 
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wrong 
i  measurement 


nj+  1 


~ Z) 

No  more  samples  in  this  interval 

Figure  25.    Consequences  of  measurement  errors, 


n+2 
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A 

9 


True  location 


xn  +  1       Excluded  from  search 
Figure  26.    Case  where  smaller  C  than  required  is  not  recognized  by  data  set. 
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Dropped  from  D 


this  interval 
Figure  27.    Case  where  C  is  underestimated  but  has  no  effect  on  algorithm 
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algorithm  will  result  in  error,  see  Figs.  25  and  26.    In  both  of  the  cases  illustrated, 
(Figs.  25,  26),  the  value  of  cp  will  be  underestimated  and  the  location  will  be 
wrong.    On  the  other  hand  it  is  possible  for  the  algorithm  to  be  successful  even  if 
C  is  less  than  required,  see  Fig.  27.     If  the  portion  of  the  function  that  requires  C 
to  be  large  lies  considerably  below  the  global  maximum  it  is  still  possible  to  obtain 
desired  results  from  the  algorithm. 

The  errors  may  be  detected  in  some  cases  by  obtaining 

f<»H)>zH 

n  n 

at  some  point  in  the  sampling  sequence,  however,  this  need  not  necessarily  happen, 
see  Fig.  26. 
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VIII.     CONCLUSIONS 

The  results  from  the  comparisons  made  between  the  sequential  method  proposed 
by  this  research  and  the  nonsequential  methods  of  grid  and  random  search  clearly  indi- 
cate that  the  proposed  sequential  metnod  is  preferable  to  the  nonsequential  methods, 
especially  in  cases  where  the  function  to  be  maximized  is  difficult  or  time  consuming 
to  evaluate  or  when  the  sampling  of  a  function  is  costly. 

The  primary  advantage  the  proposed  sequential  maximization  method  has  over 
other  methods  that  are  sequential,  such  as  gradient  and  min-max  methods,  is  that 
the  proposed  method  is  not  restricted  by  the  modality  of  the  function,  nor  does  it 
rely  on  the  regulairty  conditions  which  must  be  satisfied  by  these  other  sequential 
methods. 
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IX.     RECOMMENDATIONS  FOR  FUTURE  RESEARCH 

The  sequential  search  procedure  proposed  by  this  research  has  proved  success- 
ful in  estimating  the  global  maximum  and  its  location  of  a  function  whose  exact  form 
in  the  experimental  region  was  known.     It  is  obvious  that  the  same  method  could  be 
applied  to  functions  whose  exact  form  is  unknown.     It  is  the  recommendation  of  the 
author  that  future  research  be  conducted  in  this  area  by  investigating  practical 
examples  of  the  following  type: 

Consider  the  "black  box"  case,  where  the  output  of  the  box  is  a  function  of  a 
single    independent  input  variable  ranging  over  the  experimental  region.    The 
selection  of  the  Lipschitzian  constant  may  be   difficult  in  this  situation  but  may  be 
obtained,  by  the  experimenter's  previous  knowledge  of  the  system  under  investiga- 
tion or  from  experts  who  are  familiar  with  the  system.    By  data  linking  the  input/ 
output  signals  of  the  box  to  the  on  line  computerized  algorithm  the  same  results  can 
be  obtained.    See  Fig.  28  for  an  illustration  of  this  approach. 

The  algorithm  proposed  by  this  research  could  also  be  experimentally  investiga- 
ted in  light  of  maximizing  discrete  functions,  which  may  lead  to  several  interesting 
practical  problems. 

In  principle,  it  appears  that  the  method  could  be  extended  to  functions  of 
several  variables,  but  at  present  it  does  not  seem  to  be  computationally  feasible. 
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Computerized 
Algorithm 
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Figure  28.     Illustration  of  "black  box"  case. 
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APPENDIX  A 

FLOW  CHART  FOR  COMPUTER    PROGRAM 
WHICH  ESTIMATES  THE    GLOBAL  MAXIMUM 
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(      Start    J 


Read 
f(x) 

a,  b,  c,    e 


xQ   =    1/2  (  a  +  b  ) 

y0;  f  (  X0  » 


Vo  -  c  <°-V 

Zb  =  yQ    +     C    (  b  -  x    ) 
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Iteration 


1 


_  _  _  j 
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APPENDIX    B 

FLOW  CHART  FOR  SUBPROGRAM  ESTIMATING  LOCATION 
OF  GLOBAL  MAXIMUM  BY  ALTERNATIVE  II 
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DO 
1  ,  k 
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Print 
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APPENDIX    C 

FLOW  CHART  FOR  SUBPROGRAM  ESTIMATING  LOCATION 
OF  GLOBAL  MAXIMUM  BY  ALTERNATIVE    III 


(     Start     J 


"1 


i  Call 

Subroutine      , 
1       ARANGE 
I I 


x,i=  r.-[zj-*6  ]/c 


Yes 


Print 

V 


1 

No 

Print 

< 

i 

i  =  i 

Hf 

k 

=  2 

95 


0 


Xr.      =     t;    +    (   Zj      -<^€J     /     C 
Xl.       =     *.       '   t  Zk    "^     /C 

'k  k  K 


Yes 


xr.    =  tj     +   [z.    -<j>e]    /   C 


Yes 


Print 
Xri 


Print 

xr.       Xi 
I         'k 


-H» 


i  =  i  +  i 
k  =  k  +  1 


Yes 


Print 

xr.    =   b 
I 


Stop 


No 


96 


APPENDIX  D 

PROGRAM  LISTING  FOR  FINDING  THE  GLOBAL  MAXIMUM  OF 
A  DETERMINISTIC  FUNCTION  OF  A  SINGLE  REAL  VARIABLE 

C      SUBROUTINE  FOR  REARRANGING  THE  VECTORS  IN  THE  DATA  SET 
C      IN  NONDECPEASING  ORDER  OF  SECOND  COMPONENT 


SUBROUTINE  L4RGE(A,B,N) 

DIMENSION  A,B,G,H 

X=A(N) 

Y=B(N-1) 

Z=B(N) 

J  =  l 

IF(N.E0.2)GO  TO  5 

1  IF(A(N)  .GE.A(J)  )GO  TO  A 
L  =  N-2 

DO  2  I=J,L 
G<  I  )=A(  I) 
H( I  )  =  B(  I) 

2  CONTINUE 
LA=J+2 

DO  3  M=LA,N 
A(M)=G(M-2) 
B(M)=H(M-2) 

3  CONTINUE 
A( J)=X 
B( J)=Y 
A( J+1)=X 
B( J+1)=Z 
GO  TO  6 

4  J=J+1 
IF(J.GT.N)GO  TO  6 
GO  TO  1 

5  IF(A(N)  .GE.A( J) )GO  TO  6 
TEMP=A( J) 

A(  J)=X 
A(N)=TEMP 
TEMP=B( J) 
B(  J)=Z 
B(N)=TEMP 

6  RETURN 
END 

C      MAIN  PRCGRAM 

DIMENSION  XtY,Z 

READ  THE  FUNCTION  TO  BE  MAXI MI  ZED, F ( XH ) 

C      FUNCTION  SUBPROGRAMS  FCR  LOCATING  THE  TWO  NEW  PEAKS  OF 
C      FK(XH)  WHEN  K  IS  GREATER  THAN  2 

ZHI(ZH,YH)=(ZH+YH) /2. 
XHI1(XH,ZH,YH) =XH+( ZH- YH ) / ( 2 .*C ) 
XHI2(XhTZH,YH)=XH-( ZH-YH)/ (2.*C) 

READ  PARAMETERS  A,B,C,EPS 

C      ZERO(TH)  ITERATION 

XI=(A+B)/2. 

YI  =  F(XI  ) 

PHIHAT=YI 

Z(2)=YI+C*(R-XI) 

X(1)=A 

X(2)=B 
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C      FIRST  ITERATION 

XH=X(1) 
Y(1)=F(XH) 
ZH=Z( 1) 
YH=Y( 1) 

C      ESTIMATE  GLOBAL  MAXIMUM 

IF(YH.GT.Yl)GO  TO  1 

PHIHAT=YI 

GO  TO  2 

1  PHIHAT=YH 

C      COMPUTE  LOCATION  OF  NEW  PEAK 

2  Z  (  3  )  =  Z  H I  (  Z  H  t  YH  ) 
X(3)=XHI1( XH,ZH,YH) 

C      DROP  VECTOR  FROM  DATA  SET  WITH  HIGHEST  SECOND 
C      COMPONENT  AND  ADD  NEW  VECTOR 

zm=zm 

X( 1)=X(3) 

C      SET  K  EQUAL  TO  THE  NU^eER  OF  VECTORS  IN  THE  DATA  SET 

K  =  2 

C      REARRANGE  VECTORS  IN  DATA  SET  IN  ORDER  OF  NON- 
C      DECREASING  SECOND  COMPONENT 

CALL  LARGE(Z,X,K) 

C      SECOND  ITERATION 

XH=X(2) 
Y(2)=F(XH) 
ZH=Z(2) 
YH=Y(2) 

C      ESTIMATE  GLOBAL  MAXIMUM 

IF(YH.GE.PHIHAT)GO  TO  3 
GO  TO  4 

3  PHIHAT=YH 

C      COMPUTE  LOCATION  NEW  PEAK 

4  Z(3)=ZHI(ZH,YH) 
X(3)=XHI2( XH,ZH,YH) 

C      DROP  VECTOR  FROM  DATA  SET  WITH  HIGHEST  SECOND 
C      COMPONENT  AND  ADD  NEW  VECTOR 

Z(2)=Z(3) 
X(2)=X(3) 

C      SET  K  EQUAL  TO  THE  NUMBER  OF  VECTORS  IN  THE  DATA  SET 

K=2 

C      REARRANGE  VECTORS  IN  DATA  SET  IN  ORDER  OF  NON- 
C      DECREASING  SECOND  COMPONENT 

CALL  LARGE(Z,X,K) 

C      K(TH)  ITERATION 

C      COMPUTE  LOCATION  OF  TWC  NEW  PEAKS 
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5     Z(K+l)=ZHI(ZHt YH) 

X(K+1 )=XHI1(XH,ZH,YH) 
Z(K+2) =ZHI(ZH, YH) 
X(K+2) =XHI2( XH,ZH,YH) 

C      DROP  VECTOR  FROM  DATA  SET  WITH  HIGHEST  SECOND 
C      COMPONENT  AND  ADD  NEW  VECTOR 

Z(K)=Z(K+2) 
X(K)=X(K+2) 

C      SET  K  EQUAL  TO  THE  NUMBER  OF  VECTORS  IN  THE  DATA  SET 

K  =  K+1 

C      REARRANGE  VFCTOPS  IN  DATA  SET  IN  ORDER  OF  NON- 
C      DECREASING  SECOND  COMPONENT 

CALL  LARGE(Z,X,K) 

C      SET  YHAT  EQUAL  TO  THE  SAMPLE  VALUE  ^OR  THE  ABSCISSA 
C      OF  THE  HIGHEST  PEAK  IN  THE  DATA  SET 

YHAT=F(X(K) ) 

C      ESTIMATE  GLOBAL  MAXIMUM 


IF(YHAT.GE.PHIHAT)GO  TO  6 

GO  TO  7 

6 

PHIHAT=YHAT 

C 

REDUCTION  OF  DATA 

7 

N=0 

D09  1=1, K 

IF(Z( I  )  .LT.PHIHAT)GO  TO  9 

N=N+1 

IF(N.EQ.l)GO  TO  8 

X(N)=X( I ) 

Z(N)=Z( I) 

GO  TO  9 

8 

INDEX=K-I+1 

X(N)=X( I ) 

Z(N)=Z(I) 

9 

CONTINUE 

C      DETERMINE  THE  NUMBER  OF  VECTORS  LEFT  IN  THE  DATA  SET 

IF(N.NE.INDEX)GO  TO  10 
K=INDEX 

C  BEGIN    (K  +  DST    SAMPLE 

10  XH=X(K) 
Y(K)=F(XH) 
ZH=Z(K) 
YH=Y(K) 

C      STOPPING  RULE 

IFUZ(K)-PHIHAT)  .LE.EPSJGO  TO  11 
GO  TO  5 

11  PRINT, PHIHAT 

C      GO  TO  LCCATION  OF  GLOBAL  MAX  SUBPROGRAM 
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APPENDIX  E 

SUBPROGRAM  LISTING  <=OR  ESTIMATING  THE  LOCATION  OF  THE 
GLOBAL  MAXIMUM  BY  ALTERNATIVE  II 

C      EVALUATE  THE  FUNCTION  FOR  EACH  FIRST  COMPONENT  OF  DATA 
C      SET  Dc 

DOJ=l,K 

Y( J)=F(X( J)) 

C      LOCATE  THE  COORDINATES  cOR  WHICH  THE  VALUES  OF  THE 
C      FUNCTION  ARE  GREATER  THAN  OR  EQUAL  TO  THE  ESTIMATED 
C      MAXIMUM  OBTAINED  FROM  tH£  MAIN  PROGRAM 

IF(Y( J  ).GE.PHIHAT)GO  TO  13 
GO  TO  15 

13  PRINT  (X( J) tY( J) ) 

14  FORMAT 

15  CONTINUE 
STOP 
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APPENDIX  F 

SUBPROGRAM  LISTING  FOR  ESTIMATING  THE  LOCATION  OF  THE 
GLOBAL  MAXIMUM  BY  ALTERNATIVE  III 

C      SUBROUTINE  FOP  REARRANGING  ALL  VECTORS  IN  DATA  SET  D£ 
C      IN  NONCECREASING  ORDER  OF  FIRST  COMPONENT 


SUBROUTINE  AR ANGE( A ,B , N) 
DIMENSION  A(N) , B(N) 
L=N-1 

DO  10  J=1,L 
JP1=J+1 
DO  10  I=JP1,N 
IF(A( J  ).LE.A( I ) )GOT010 
TEMP=A( J) 
A( J)=A( I ) 
A(  I  )=TEMP 
TEMP=B( J) 
B( J)=B(  I  ) 
B( I  )=TEMP 
10    CONTINUE 
RETURN 
END 

C      REARRANGE  DATA  IN  D€  IN  ORDER  OF  NONDECRE AS  I NG  FIRST 
C      COMPONENT 

30  CALL  ARRANGE(X,Z,K) 

C      DETERMINE  THE  LEFTMOST  ENDPOINT  OF  THE  GENUINE 
C      UNCERTAINTY  SET  V 

XL(1  )=X(1)-(Z(  D-PHIHATJ  /C 
IF(XL(  D.LT.AJGO  TO  31 
GO  TO  3  2 

31  XL(1)=A 

32  PRINT  XL(1) 
00    FORMAT 

J=l 
L  =  l 

C      DETERMINE  THE  RIGHT  AND  LEFT  ENDPOINTS  OF  REMAINING 
C      INTERVALS  IN  THE  SET  V 

34  XR( J)=X( J)+Z( J)-PHIHAT)/C 
XL(L)=X(L)-(Z(L)-PHIHAT)/C 
IF(XR( J) .LT.XL( L))GO  TC  35 
GO  TO  27 

35  PRINT,  XR(J)tXLU) 

36  FORMAT 

37  J=J+1 
L  =  L  +  1 

IF(L.GT.K)GO  TO  38 
GO  TO  34 

C      DETERMINE  RIGHTMOST  ENDPOINT  OF  V 

38  XR( J)=X( J)+(Z( J)-PHIHAT) /C 
IF(XR( J)  .GT.B)GO  TO  39 

GO  TO  40 

39  XR(J)=E 

40  PRINT,  XR(J) 

41  FORMAT 
STOP 
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